8a7c1bf7 |
1 | In Fall of 2006 I did a small project on Metaobject Protocols for my |
2 | CS 331 class. Here lie my notes which may perhaps be useful to |
3 | others. I hope to expand them into something more useful over time. |
4 | |
5 | * Background |
6 | |
7 | ** Object Protocols |
8 | |
9 | An object protocol is a set of methods and specification of the |
10 | interactions between the methods which provide some generic behavior |
11 | (e.g. of a sequence) that are then implemented by classes which |
12 | conform to the protocol (e.g. a vector or list). In most object |
13 | systems a class contains both the methods which implement a protocol |
14 | and the data used by the implementation. The intent is to emulate |
15 | state machines which pass messages between each other. |
16 | |
17 | ** CLOS Way of OO |
18 | |
19 | The Common Lisp Object System (CLOS) is different. It separates |
20 | the data and method concepts into classes and generics. A class |
21 | contains data fields only, and a generic has methods specialized for |
22 | certain types attached to it. This seems a bit weird at first, but is |
23 | significantly more powerful as it encourages complete encapsulation |
24 | through its use of classes primarily for method specialization rather |
25 | than for state storage. |
26 | |
27 | |
28 | *** Classes for scratch data and types |
29 | |
30 | In CLOS classes store data in slots (which are the same as data |
31 | members). Encapsulation is not provided; any bit of code can use |
32 | =slot-value= to access or set the value of a slot. This may seem odd at |
33 | first, but encapsulation is of questionable importance as the slots |
34 | are meant only to be used by the protocol defined around the class. |
35 | |
36 | Classes are defined with defclass |
37 | |
38 | <src lang="lisp"> |
39 | (defclass name (superclasses ...) |
40 | ((slot-name :accessor slot-accessor ...) |
41 | ...) |
42 | (class-options ...)) |
43 | |
44 | (defclass example () |
45 | ((foo :accessor foo-of :initform 5))) |
46 | |
47 | (defclass example-child (example) |
48 | ((bar :accessor bar-of :initform (list 1 2 3)))) |
49 | </src> |
50 | |
51 | Slot defintions have several option; the above example shows only the |
52 | =:accessor= and =:initform= options which are the most commonly |
53 | used. =:accessor= generates an accessor for the slot (e.g. if you have |
54 | an instance of =example= you can =(setf (foo-of some-example-instance) 'some-value)= to set and =(foo-of some-example-instance)= to access the |
55 | value). =:initform= provides a default initial value for the slot as a |
56 | symbolic expression to be evaluated when an instance is created. |
57 | |
58 | *** Generics with methods that implement protocols |
59 | |
60 | Generics are like normal functions in Lisp, but they only provide a |
61 | lambda list (parameter list). Methods are added to the generic which |
62 | specialize on the types of their parameters, and provide the actual |
63 | implementation. This allows for rich layered protocols to be developed |
64 | which can enable selective modification of individual facets with |
65 | minimal code. |
66 | |
67 | <src lang="lisp"> |
68 | (defgeneric generic (parameters ...) |
69 | (options) ...) |
70 | |
71 | (defmethod generic-name ((parameter type) parameter ...) |
72 | "documentation string" |
73 | body) |
74 | |
75 | (defgeneric foo (bar baz quux) |
76 | (:documentation "Process the baz with the quux capacitor to make the |
77 | foo widget fly into the sky at warp speed")) |
78 | |
79 | (defmethod foo ((bar example) baz (quux capacitor)) |
80 | (launch bar (process-with quux baz))) |
81 | </src> |
82 | |
83 | A method lambda list differs from a normal lambda list only in that it |
84 | can specify the type of the parameter using the notation =(name type)=. |
85 | Note also that methods can specialize on the types of every |
86 | argument and not just the first one. This is quite powerful for |
87 | reasons outside of the scope of this presentation. |
88 | |
89 | * Limitations of Default Language Behavior |
90 | |
91 | The behavior of a language is a compromise between many competing |
92 | issues that attempts to be as generally useful as possible, and most |
93 | applications will have no issue with the default behavior. There are, |
94 | however, certain applications that could be cleanly written with minor |
95 | modifications to the behavior of the language, but would be impossible |
96 | or quite difficult to write otherwise. |
97 | |
98 | ** Slot Storage |
99 | |
100 | Most languages choose to preallocate storage for all of the slots of |
101 | an instance. Imagine a contact database that stored information about |
102 | people as slots of the class. There may be dozens of slots, but often |
103 | many of them will be left blank. If slot storage is preallocated much |
104 | memory will be wasted and the system may not be able to fit into the |
105 | memory of the hardware it must run on (perhaps for financial reasons, |
106 | huge datasets, etc.). |
107 | |
108 | To save memory the author of the contact database must implement his |
109 | own system to store properties and allocate them lazily. This |
110 | represents a fair bit of effort, and would implement a system that |
111 | differed from the existing slot system of classes only in the method |
112 | of data storage. |
113 | |
114 | It would be useful if there were a way to customize instance |
115 | allocation. The customizations would be minor and require overriding |
116 | only the initial allocation behavior and the behavior of the first |
117 | assignment to the slot. It is a a trivial problem in a language that |
118 | allows customization of these. |
119 | |
120 | ** Design Patterns |
121 | |
122 | Design Patterns are generalized versions of common patterns found in |
123 | programs. Many of them are merely methods to get around deficiencies |
124 | in the language, and can be quite messy to implement in some |
125 | languages. |
126 | |
127 | * Metasoftware |
128 | |
129 | Some types of programs could be written easily if the language were |
130 | customizable, but are nearly impossible to write when it is not. |
131 | |
132 | ** Runtime Generated Classes |
133 | |
134 | Say you wanted to write a video game where players could create their |
135 | own objects, attach behaviors to the objects, and perhaps mix |
136 | different objects together to create new ones. When you abstract the |
137 | problem this looks just like an object system! Wouldn't it be nice if |
138 | your program could create new objects and methods on the fly portably? |
139 | |
140 | ** Object Inspection |
141 | |
142 | Imagine if you were developing a complicated program with many |
143 | different objects that interacted in fairly complex ways. A tool to |
144 | inspect the structure of objects while debugging would be quite |
145 | useful, but in a traditional language would be impossible to implement |
146 | portably. This could force you to purchase a certain compiler |
147 | implementation which provided one, and even then would more than |
148 | likely not be customizable. |
149 | |
150 | This problem can be generalized to apply to most debugging tools; it |
151 | would be useful to write such tools portably because users of the |
152 | *language* and not the *compiler* need to debug software. Sharing |
153 | infrastructure would result in better tools (more developers), and |
154 | save man-years of wasted effort that comes with having to rewrite |
155 | non-portable functionality from scratch multiple times. |
156 | |
157 | * Metaobject Protocols |
158 | |
159 | ** Limited/Generalized Internals of the Implementation |
160 | |
161 | A Metaobject protocol is a generalized and limited subset of the |
162 | underlying implementation of the language. It is generalized and |
163 | limited in scope to allow for multiple implementation strategies; |
164 | this, along with careful design, is essential because programming |
165 | language research is ever advancing and new techniques for creating |
166 | more reliable and faster implementations are still being discovered. |
167 | |
168 | This subset of the implementation is exported as a set of methods on |
169 | metaobjects. Thus the system is implemented in itself. The system can |
170 | then be customized using the extension and overriding features of the |
171 | system. |
172 | |
173 | ** Classes of MOPs |
174 | |
175 | *** Reflective |
176 | |
177 | A reflective MOP provides a functional/procedural interface to |
178 | information about the system. It exposes class relationships, the |
179 | methods are attached to a generic, etc. A reflective MOP often |
180 | provides some functionality for creating new classes at runtime. |
181 | |
182 | **** Example: Object Inspector |
183 | |
184 | **** Example: Runtime Generated Classes and Methods |
185 | |
186 | *** Intercessory |
187 | |
188 | Intercessory MOPs allow the user to customize language behavior by |
189 | implementing methods which override certain aspects of the language |
190 | behavior. This class of MOPs are what make MOPs especially |
191 | powerful. No longer must a problem be restructured to fit the |
192 | implementation language; the underyling language can be reshaped to |
193 | fit the task at hand, and obfuscation of the intended structure of the |
194 | application can be avoided. |
195 | |
196 | **** Example: Lazily Allocated Slots |
197 | |
198 | **** Example: Observer Design Pattern |
199 | |
200 | ** Violation of Encapsulation? |
201 | |
202 | A MOP may seem like a violation of encapsulation by revealing some |
203 | implementation details, but in reality a well designed protocol does |
204 | not reveal anything which was not already exposed. Implementation |
205 | decisions affect users, and some of these details do leak through to |
206 | higher levels (e.g. the memory layout of slots). Implicit in the |
207 | protocol specification are these implementation details, and the MOP |
208 | merely makes this limited subset available for customization. |
209 | |
210 | A MOP makes it possible to customize certain implementation decisions |
211 | that do not **radically** alter the behavior of the base language. The |
212 | conceptual vocabulary of the system retains its meaning, and so code |
213 | written in one dialect can interact with code written in another |
214 | without knowing that they speak different ones. |
215 | |
216 | * MOP Design Principles |
217 | |
218 | ** Layered Protocol |
219 | |
220 | A layered protocol design is good for both meta and normal object |
221 | protocols, and enables a combinatorial explosion of customizations to |
222 | the protocol. |
223 | |
224 | *** Top level **must** call lower level functions |
225 | |
226 | The top level methods of a layered metaobject protocol are required to |
227 | call certain methods to perform some tasks. This both makes it easier |
228 | to customize the top level methods (which perform very broad tasks) by |
229 | providing some pieces of them for the programmer, and allows more |
230 | customization to be done by opening up the replacement of lower level |
231 | functions as a way to alter a small detail of the high level behavior. |
232 | |
233 | *** Lower level methods are easier to customize |
234 | |
235 | The lower level methods of a MOP are limited in scope and can be |
236 | implemented easily. Often the changes to language behavior that are |
237 | wanted are very small, and having methods that perform simple tasks |
238 | which are often customized reduces the effort required to extend the |
239 | system. |
240 | |
241 | ** Functional Where Possible |
242 | |
243 | Functional protocols are preferred for MOPs (and object protocol in |
244 | general). Functional protocols open up several optimizations for the |
245 | implementation without burdening the user of the protocol. |
246 | |
247 | *** Memoization |
248 | |
249 | Memoization is the process of saving the results of a function call |
250 | for future use. This avoids expensive recomputation of values which |
251 | have not changed (recall that a true function will always return the |
252 | same result when given the same arguments). |
253 | |
254 | A functional MOP can be optimized easily by exploiting this property |
255 | to memoize the return values of calls to expensive operations. A MOP |
256 | must be be very fast to avoid making programs unusably slow, and |
257 | memoization is able to give an appreciable speedup in many cases |
258 | without an insignificant burden on memory usage. |
259 | |
260 | **** Constant Shared Return Values |
261 | |
262 | Disallowing the modification of values returned by protocol methods |
263 | allows the implementation to return large data structures by reference |
264 | to avoid expensive copying without having to do expensive data |
265 | integrity checks. |
266 | |
267 | *** Cleaner Code |
268 | |
269 | ** Procedural Only Where Neccesary |
270 | |
271 | Some operations like method invocation are inheretly stateful and so |
272 | must use a procedural protocol. There is no benefit to be gained from |
273 | using a functional protocol, and indeed an attempt would result in |
274 | obtuse code that severely restricted the implementor. Do note that |
275 | only a very small part of method invocation is stateful (the actual |
276 | call), and most of it can be implemented functionally (e.g. computing |
277 | the discriminating function). |
278 | |
279 | * Examples |
280 | |
281 | ** Object Inspector |
282 | |
283 | A primitive portable object inspector is presented here. |
284 | |
285 | <src lang="lisp"> |
286 | (defgeneric example-inspect (instance) |
287 | (:documentation "Simple object inspector using CLOS MOP")) |
288 | |
289 | (defmethod example-inspect ((instance t)) |
290 | (format t "Simple Object~% Value: ~S~%" instance)) |
291 | |
292 | (defmethod example-inspect ((instance standard-object)) |
293 | (let ((class (class-of instance))) |
294 | (format t "Class: ~S, Superclasses: ~S~%" |
295 | (class-name class) |
296 | (mapcar #'class-name |
297 | (class-precedence-list class))) |
298 | (let ((slot-names (mapcar #'slot-definition-name |
299 | (class-slots class)))) |
300 | (format t "Slots: ~%~{ ~S~%~}" slot-names) |
301 | (inspect-loop slot-names instance #'example-inspect)))) |
302 | |
303 | (defun inspect-loop (slots instance inspector) |
304 | (format t "Enter slot to inspect or :pop to go up one level: ") |
305 | (finish-output) |
306 | (let* ((slot (read)) |
307 | (found-slot (member slot slots))) |
308 | (cond (found-slot |
309 | (funcall inspector (slot-value instance slot)) |
310 | (funcall inspector instance)) |
311 | ((eq slot :pop) t) |
312 | (t |
313 | (format t "~S is invalid. Valid slot names: ~S~%" |
314 | slot |
315 | slots) |
316 | (inspect-loop slots instance inspector))))) |
317 | </src> |
318 | |
319 | ** Observer Design Pattern |
320 | |
321 | A simple implementation of the observer pattern is under 100 lines, |
322 | and the user level code requires only a single line of code to make |
323 | any existing class observable. |
324 | |
325 | In a language lacking a MOP, implementing the observer pattern |
326 | requires modifying every accessor of a class to explicitly invoke any |
327 | observers, and neccesitates the addition of a mixin class to the class |
328 | heirarchy. The fact that an object can be observed is a meta property |
329 | of the class, and forcing it to be implemented at the application |
330 | level dirties the inheritance heirarchy and adds uneccesary meta |
331 | details to the program. |
332 | |
333 | <src lang="lisp"> |
334 | ;;; This metaclass adds a slot to instances which use it, and so the |
335 | ;;; system is defined in its own package to avoid name conflicts |
336 | (defpackage :observer |
337 | (:use :cl #+sbcl :sb-mop) |
338 | (:export observable register-observer unregister-observer)) |
339 | |
340 | (in-package :observer) |
341 | |
342 | ;;; Metaclass |
343 | (defclass observable (standard-class) |
344 | () |
345 | (:documentation "Metaclass for observable objects")) |
346 | |
347 | (defmethod compute-slots ((class observable)) |
348 | "Add a slot for storing observers to observable instances" |
349 | (cons (make-instance 'standard-effective-slot-definition |
350 | :name 'observers |
351 | :initform '(make-hash-table) |
352 | :initfunction #'(lambda () (make-hash-table))) |
353 | (call-next-method))) |
354 | |
355 | (defmethod validate-superclass ((class observable) |
356 | (super standard-class)) |
357 | t) |
358 | |
359 | (defun register-observer (instance slot-name key closure) |
360 | (register-observer-with-class (class-of instance) |
361 | instance |
362 | slot-name |
363 | key |
364 | closure)) |
365 | |
366 | (defun unregister-observer (instance slot-name key) |
367 | (unregister-observer-with-class (class-of instance) |
368 | instance |
369 | slot-name |
370 | key)) |
371 | |
372 | (defun get-observers (instance slot-name) |
373 | (get-observers-with-class (class-of instance) |
374 | instance |
375 | slot-name)) |
376 | |
377 | (defun add-observer-table (instance slot-name) |
378 | (setf (gethash slot-name (slot-value instance |
379 | 'observers)) |
380 | (make-hash-table))) |
381 | |
382 | (defgeneric register-observer-with-class (class instance slot-name key closure)) |
383 | (defgeneric unregister-observer-with-class (class |
384 | instance |
385 | slot-name |
386 | key)) |
387 | |
388 | (defmethod register-observer-with-class ((class observable) |
389 | instance |
390 | slot-name |
391 | key |
392 | closure) |
393 | (setf (gethash key |
394 | (or (gethash slot-name |
395 | (slot-value instance 'observers)) |
396 | ;; Lazily add observer hash tables |
397 | (add-observer-table instance slot-name))) |
398 | closure)) |
399 | |
400 | (defmethod unregister-observer-with-class ((class observable) |
401 | instance |
402 | slot-name |
403 | key) |
404 | (remhash key (gethash slot-name |
405 | (slot-value instance 'observers)))) |
406 | |
407 | (defmethod get-observers-with-class ((class observable) |
408 | instance |
409 | slot-name) |
410 | (gethash slot-name (slot-value instance 'observers))) |
411 | |
412 | (defmethod (setf slot-value-using-class) :before (new-value |
413 | (class observable) |
414 | instance |
415 | slot) |
416 | (let ((slot-name (slot-definition-name slot))) |
417 | (if (not (eq 'observers slot-name)) |
418 | (let ((observers |
419 | (get-observers instance (slot-definition-name slot)))) |
420 | (if observers |
421 | (maphash #'(lambda (key observer) |
422 | (funcall observer |
423 | (if (slot-boundp instance slot-name) |
424 | (slot-value instance slot-name) |
425 | nil) |
426 | new-value)) |
427 | observers)))))) |
428 | </src> |
429 | |
430 | ** Real World |
431 | *** [[http://common-lisp.net/project/ucw/][UCW]] and [[http://common-lisp.net/project/bese/arnesi.html][Arnesi]] |
432 | |
433 | Arnesi uses the CLOS MOP to implement methods which are transparantly |
434 | rewritten into continuation passing style. This allows their execution |
435 | to be suspended at certain points and resumed later. UCW builds on top |
436 | of this to support a web framework where the statelessness of http is |
437 | hidden from the user; displaying a page suspends the execution of the |
438 | current continuation, and resumes it upon submission. The user level |
439 | code is completely unaware of this. |
440 | |
441 | *** [[http://clsql.b9.com][CLSQL]] |
442 | |
443 | CLSQL uses the reflective part of the CLOS MOP to map Common Lisp data |
444 | types into SQL types, and the intercessory protocol for slot |
445 | allocation to map slots onto database columns or sql expressions (for |
446 | implementing relational slots). |
447 | |
448 | *** [[http://common-lisp.net/project/elephant/][Elephant]] |
449 | |
450 | Elephant uses the CLOS MOP to transparantly store any class to disk |
451 | and handle paging between the disk store and memory efficiently and |
452 | with no user intervention. |
453 | |
454 | * Sources & Further Reading |
455 | |
456 | ** Sources |
457 | |
458 | *** The Art of the Metaobject Protocol |
459 | **** Kiczales, Gregor et al. MIT Press 1991 |
460 | |
461 | Highly recommended reading even if you plan to never implement a MOP |
462 | or use the CLOS one. The design principles it recommends are quite |
463 | useful. |
464 | |
465 | *** [[http://www.lisp.org/mop/contents.html][CLOS MOP Specification]] |
466 | |
467 | Specification of the MOP for CLOS defined in *The Art of the Metaobject Protocol*. |
468 | |
469 | *** [[http://citeseer.ist.psu.edu/399658.html][Metaobject Protocols: Why We Want Them and What Else They Can Do]] |
470 | |
471 | A short overview of MOP design principles followed by three example |
472 | metaobject protocols for Scheme. |
473 | |
474 | *** [[http://www2.parc.com/csl/groups/sda/projects/oi/towards-talk/transcript.html][Why Are Black Boxes so Hard to Reuse?]] |
475 | |
476 | Transcription of a talk on the benefits of open implementations of |
477 | software. It first discusses several problems that black box software |
478 | implementations pose, and then presents existing solutions. It shows |
479 | how the existing solutions are insufficient, and then provides |
480 | metaobject protocols as a solution to most of the problems. |
481 | |
482 | ** Further Reading |
483 | |
484 | *** [[http://citeseer.ist.psu.edu/chiba95metaobject.html][A Metaobject Protocol for C++]] |
485 | |
486 | Example of a purely compile time MOP. It implements the functionality |
487 | of a code walker and something similar to the Lisp macro system. |
488 | |
489 | *** [[http://www.parc.com/csl/groups/sda/publications/papers/Kiczales-TUT95/for-web.pdf][Open Implementations and Metaobject Protocols]] |
490 | |
491 | It is a bit long, but it seems to follow a similar structure to AMOP |
492 | in introducing MOPs and their usefulness. The pages are slides with |
493 | notes, and so the 331 pages might not actually take that long to read. |